Workshop project

Let’s create a new RStudio project to work in:

File > New Project > New Directory > New Project > “Practical Data Management”

In our new project, let’s create a data/ folder in which to store the data.

dir.create("data")
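Calling dir.create() on a folder that already exists only raises a warning, but a guard keeps reruns of the script quiet. A minimal sketch, run in a temporary directory (the `project_root` and `data_dir` names are illustrative, not part of the workshop):

```r
# Work in a temporary directory so the sketch does not touch a real project.
project_root <- tempfile("project_")
dir.create(project_root)
data_dir <- file.path(project_root, "data")

# Create the folder only when it is missing, so rerunning the script
# does not trigger an "already exists" warning.
if (!dir.exists(data_dir)) {
  dir.create(data_dir)
}

dir.exists(data_dir)
```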

Let’s also open a new R script in which to work:

File > New File > R Script

Save it in the project root, e.g. as metadata_dev.R


Packages

Let’s install and load all the packages we’ll need for the workshop:

install.packages("tidyverse")
install.packages("here")
install.packages("devtools")
devtools::install_github("ropenscilabs/dataspice")
library(tidyverse)
library(dataspice)
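Reinstalling every package each time the script runs is slow. One possible approach is to check first which packages are missing. The sketch below uses a hypothetical helper, `missing_packages()`, which is not part of any package, and only reports what is missing rather than installing anything:

```r
# Hypothetical helper: return the subset of `pkgs` that cannot be found
# in the current library paths.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

to_install <- missing_packages(c("tidyverse", "here", "devtools"))

if (length(to_install) > 0) {
  message("Still to install: ", paste(to_install, collapse = ", "))
} else {
  message("All CRAN packages are already installed.")
}
```

The resulting vector can then be passed to install.packages(to_install) when it is not empty.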



Data

For more information on the data source, check the tutorial README.

Get data

The readr::read_csv() function allows us to read raw CSV data directly from a URL.

vst_mappingandtagging <- read_csv("https://raw.githubusercontent.com/annakrystalli/dataspice-tutorial/master/data/vst_mappingandtagging.csv")

vst_perplotperyear <- read_csv("https://raw.githubusercontent.com/annakrystalli/dataspice-tutorial/master/data/vst_perplotperyear.csv")

Inspect data

You can inspect any object in your environment in RStudio using the View() function:

vst_mappingandtagging %>% View()
vst_perplotperyear %>% View()
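View() needs an interactive session, so when running the script non-interactively a few base R functions give a similar overview in the console. A minimal sketch on a toy data frame (the column names and values are made up for illustration, not taken from the workshop data):

```r
# Toy stand-in for one of the workshop tables; names and values are invented.
toy <- data.frame(
  plotID = c("A_001", "A_002", "B_001"),
  elevation = c(232.4, 240.1, 512.9),
  stringsAsFactors = FALSE
)

str(toy)                # column types and the first few values
head(toy, 2)            # first two rows
summary(toy$elevation)  # five-number summary plus the mean
dims <- dim(toy)        # number of rows and columns
dims
```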

Save data

write_csv(vst_mappingandtagging, here::here("data", "vst_mappingandtagging.csv"))
write_csv(vst_perplotperyear, here::here("data", "vst_perplotperyear.csv"))

Create metadata files

create_spice()

By default, this creates a metadata folder inside your project’s data folder (although you can specify a different directory) containing four CSV files in which to record your metadata: access.csv, attributes.csv, biblio.csv and creators.csv.


Record metadata

creators

Let’s start with a quick and easy one, the creators. We can open and edit the file in an interactive shiny app using edit_creators():

edit_creators()

Remember to click on Save when you’re done editing.

access

Before manually completing any details, we can use dataspice’s dedicated function prep_access() to extract the information required for access.csv:

prep_access()

Again, we can use the edit_access() function to complete the final details required, namely the URL from which each dataset can be downloaded. Use the URL from which we downloaded each data file in the first place (hint ☝️)

We can also edit details such as the name field to something more informative if required.

Remember to click on Save when you’re done editing.

edit_access()

biblio

Before we start filling in this table, we can use some base R to extract some of the information we require. In particular, we can use the range() function to extract the temporal and spatial extents of our data.

Get date range

range(vst_perplotperyear$date, vst_mappingandtagging$date) 
## [1] "05/22/15" "11/18/15"
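The output above is quoted, which indicates the dates are stored as text, so range() compares them character by character rather than chronologically. That happens to work while all dates fall in the same year, but not in general. A minimal sketch, assuming the month/day/two-digit-year layout seen in the output (the third date is made up to show the problem):

```r
# Date strings in the layout shown above; "01/10/16" is a made-up value
# added to illustrate what happens when the data span two years.
dates_chr <- c("05/22/15", "11/18/15", "01/10/16")

# Text comparison ranks the January 2016 date first.
text_range <- range(dates_chr)
text_range

# Parsing to Date first gives a chronological range.
dates <- as.Date(dates_chr, format = "%m/%d/%y")
date_range <- range(dates)

# ISO-formatted strings for recording the start and end dates.
format(date_range, "%Y-%m-%d")
```

The ISO format is convenient here because the startDate and endDate fields of the biblio table are parsed as dates.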

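Get geographical extent

South/North boundaries

range(vst_perplotperyear$decimalLatitude)
## [1] 42.39229 44.06795

West/East boundaries

range(vst_perplotperyear$decimalLongitude)
## [1] -72.26573 -71.28145

We can now open the biblio table in the interactive editor and enter these extents. Remember to click on Save when you’re done editing.

edit_biblio()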
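To avoid mixing up the four boundary values when typing them into the biblio table, they can be gathered into one named vector first. A minimal sketch, using a toy data frame built from the range() output printed above in place of vst_perplotperyear (the `coords` and `bbox` names are illustrative):

```r
# Toy coordinates taken from the printed ranges; a real run would use
# the decimalLatitude and decimalLongitude columns of vst_perplotperyear.
coords <- data.frame(
  decimalLatitude = c(42.39229, 44.06795),
  decimalLongitude = c(-72.26573, -71.28145)
)

lat <- range(coords$decimalLatitude, na.rm = TRUE)
lon <- range(coords$decimalLongitude, na.rm = TRUE)

# Name each value after the biblio field it belongs to.
bbox <- c(
  northBoundCoord = lat[2],
  eastBoundCoord  = lon[2],
  southBoundCoord = lat[1],
  westBoundCoord  = lon[1]
)
bbox
```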

attributes

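# List the full paths of the CSV data files in data/.
# The pattern is a regular expression: escape the dot and anchor the extension.
data_files <- list.files(here::here("data"),
                         pattern = "\\.csv$",
                         full.names = TRUE)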

data_files
## [1] "/Users/Anna/Documents/workflows/workshops/dataspice-tutorial/data/vst_mappingandtagging.csv"
## [2] "/Users/Anna/Documents/workflows/workshops/dataspice-tutorial/data/vst_perplotperyear.csv"

data_files %>% purrr::map(~prep_attributes(.x))

edit_attributes()
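The pattern argument of list.files() is a regular expression, in which an unescaped dot matches any character. A minimal sketch with made-up file names shows how a loose pattern such as ".csv" differs from an escaped, anchored one:

```r
# Made-up file names for illustration only.
files <- c("vst_perplotperyear.csv", "mycsv_notes.txt", "backup.csv.zip")

# Unescaped dot: any character followed by "csv", anywhere in the name.
loose <- grepl(".csv", files)

# Escaped dot plus end-of-string anchor: only names ending in ".csv".
strict <- grepl("\\.csv$", files)

data.frame(files, loose, strict)
```

With the loose pattern all three names match, whereas the strict pattern keeps only the first.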

Create metadata JSON file

write_spice()
## Parsed with column specification:
## cols(
##   title = col_character(),
##   description = col_character(),
##   datePublished = col_date(format = ""),
##   citation = col_character(),
##   keywords = col_character(),
##   license = col_character(),
##   funder = col_character(),
##   geographicDescription = col_character(),
##   northBoundCoord = col_double(),
##   eastBoundCoord = col_double(),
##   southBoundCoord = col_double(),
##   westBoundCoord = col_double(),
##   wktString = col_character(),
##   startDate = col_date(format = ""),
##   endDate = col_date(format = "")
## )
## Parsed with column specification:
## cols(
##   fileName = col_character(),
##   variableName = col_character(),
##   description = col_character(),
##   unitText = col_character()
## )
## Parsed with column specification:
## cols(
##   fileName = col_character(),
##   name = col_character(),
##   contentUrl = col_character(),
##   fileFormat = col_character()
## )
## Parsed with column specification:
## cols(
##   id = col_character(),
##   givenName = col_character(),
##   familyName = col_character(),
##   affilitation = col_character(),
##   email = col_character()
## )

Build README site

build_site()